<html>
<head>
  <title>XML flattener tutorial</title>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>

<table width="100%" align="center">
  <tr>
  <td><img src="../images/titolo_mint.jpg" border=0></td>

  <td><h1>XML Flattener tutorial: how to create a flat file from a 
  PSI XML file</h1></td>
  </tr>
</table>


<p>The purpose of this tutorial is to provide a step-by-step guide 
for users who want to extract information from a 
<a href="http://psidev.sourceforge.net/">PSI</a> XML file and 
produce output in a flat file format.
In the output file, each line describes a protein interaction.</p>

<ol>
<li><h2>Open a schema</h2>
<p>The first step is to load the PSI schema by 
clicking the “open a schema” button. The file is called MIF.xsd and 
is available on the PSI web page. You can also find it in the data directory.
Once the schema is loaded, the root node, named entrySet, should be displayed in the main frame. 
Some nodes are colored <font color="red">red</font>, indicating
that the corresponding element in the PSI document must never be empty according to the XML schema.
</p>
</li>  

<li><h2>Open an XML (PSI) document</h2>
<p>You are now ready to load the PSI document. This step can take some time because 
the application checks the file.</p>
</li>

<li><h2>Node describing a line</h2>
<p>Since we want each line of the flat file to describe an interaction, we select the node 
<b>interaction</b> (entrySet, entry, interactionList, interaction) and click on 
<b>node describing a line</b>. This node is now colored <font color="blue">blue</font>.</p>
<p><i>If we had not made this selection, the 
application would have chosen by itself the node most
representative of a line once we had begun to select nodes (it looks for
the last duplicable node that groups all selected nodes).</i></p>
</li>

<li><h2>Selecting the nodes</h2>
<p>Now we can choose the elements we want in the flat file. For example, if we 
want the <b>short label</b> of the interaction, we select the node <b>shortLabel</b>
(interaction, names, shortLabel)
and press the button <b>select this node</b>. The node is now colored <b>black</b>. 
<b>[shortLabel]</b> should now appear in the frame titled 
<b>preview</b>, indicating that the flat file will contain 
one column 
for the shortLabel. Next, if we also want the shortLabel of each interactor 
in the flat file, we click on 
participantList, proteinParticipant, (proteinInteractorRef|proteinInteractor), which is
automatically expanded, proteinInteractor, names and shortLabel, and press <b>select this node</b>.</p> 
<p><i>The flattener scans the document for the maximum number of 
interactors participating in any interaction described in the XML file. 
If the largest interaction involves ten proteins, it adds
ten columns for the short labels. References are also taken into account: 
even if the PSI document is normalized, the flattener follows the references
to find the short labels of the referenced interactors. For this reason, we should
never select a reference node itself, since it does not contain any real information.
Instead, we let the application resolve it as if the file were not normalized.</i></p>
</li>

<li><h2>Choose the separator</h2>
<p>Before printing the flat file, we can choose the field <b>separator</b> (for example |, ; or ,).</p>
</li>

<li><h2>Print the flat file</h2>
<p>We can finally print the flat file by pressing the button <b>print</b>.</p>
</li>

</ol>
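
<p>The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the 
application's actual implementation: the helper names <b>add_named</b>, <b>build_sample</b> and <b>flatten</b> are hypothetical, 
the sample document is a heavily simplified PSI-like tree without namespaces, and references 
(proteinInteractorRef) are not resolved. The sketch picks the node describing a line, collects the selected 
nodes below it, and pads every row to the largest number of occurrences found in the document.</p>

```python
import xml.etree.ElementTree as ET


def add_named(parent, tag, label):
    """Append a child element that carries names/shortLabel = label."""
    node = ET.SubElement(parent, tag)
    names = ET.SubElement(node, "names")
    ET.SubElement(names, "shortLabel").text = label
    return node


def build_sample():
    """Build a tiny, simplified PSI-like tree (assumption: no namespaces, no references)."""
    root = ET.Element("entrySet")
    entry = ET.SubElement(root, "entry")
    interactions = ET.SubElement(entry, "interactionList")
    sample = [("int-1", ["protA", "protB"]), ("int-2", ["protA", "protC", "protD"])]
    for label, proteins in sample:
        interaction = add_named(interactions, "interaction", label)
        participants = ET.SubElement(interaction, "participantList")
        for protein in proteins:
            participant = ET.SubElement(participants, "proteinParticipant")
            add_named(participant, "proteinInteractor", protein)
    return root


def flatten(root, row_path, selected, separator="|"):
    """Return one text line per node matching row_path.

    Each selected path gets as many columns as its largest number of
    occurrences in any row; shorter rows are padded with empty cells.
    """
    rows = root.findall(row_path)
    # values[i][j] holds every text found for selected path j in row i
    values = [
        [[(e.text or "").strip() for e in row.findall(path)] for path in selected]
        for row in rows
    ]
    widths = [max((len(v[j]) for v in values), default=0) for j in range(len(selected))]
    lines = []
    for v in values:
        cells = []
        for j, width in enumerate(widths):
            cells.extend(v[j] + [""] * (width - len(v[j])))
        lines.append(separator.join(cells))
    return lines


# Node describing a line, relative to the root node entrySet
ROW_PATH = "entry/interactionList/interaction"
# Selected nodes, relative to the node describing a line
SELECTED = [
    "names/shortLabel",
    "participantList/proteinParticipant/proteinInteractor/names/shortLabel",
]

flat_lines = flatten(build_sample(), ROW_PATH, SELECTED, separator="|")
for line in flat_lines:
    print(line)
```

<p>With this sample, the largest interaction has three interactors, so every line gets one column for the 
interaction label and three columns for interactor labels. The first interaction has only two interactors, 
so its last cell is left empty.</p>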


<h3><a name="contact">Contacts</a></h3>
	<p>This software was created at the <b>University of Roma "Tor Vergata"</b> by Arnaud Ceol
	with the help of the <a href="http://cbm.bio.uniroma2.it/mint/">Mint Group</a>.
	For more information, you can contact me at 
	<a href="mailto:arnaud@cbm.bio.uniroma2.it">arnaud@cbm.bio.uniroma2.it</a>.</p>
	<p>PSI: the <a href="http://psidev.sourceforge.net/">Proteomics Standards Initiative</a></p>
 
</body>
</html>
